Eager combining: a coherency protocol for increasing effective network and memory bandwidth in shared-memory multiprocessors
Abstract
An excessive number of remote accesses or a non-uniform distribution of remote accesses can cause even well-designed multiprocessors to exhibit severe memory and network contention. Producer/consumer data generates a particularly common sharing pattern that results in a non-uniform distribution of references. In this paper we quantify the performance impact of producer/consumer sharing as a function of memory and network bandwidth, and argue that the contention caused by this form of sharing severely impacts performance on large-scale machines. We propose a new coherency protocol, called eager combining, which is designed to alleviate this contention. We use execution-driven simulation of parallel programs on a large-scale multiprocessor to show that eager combining can improve the performance of programs with producer/consumer data by a factor of 4 or more.
Similar resources
Eager Combining: a Coherency Protocol for Increasing Effective Network and Memory Bandwidth in Shared-memory Multiprocessors
One common cause of poor performance in large-scale shared-memory multiprocessors is limited memory or interconnection network bandwidth. Even well-designed machines can exhibit bandwidth limitations when a program issues an excessive number of remote memory accesses or when remote accesses are distributed non-uniformly. While techniques for improving locality of reference are often successful...
Adaptive Cache Coherency for Detecting Migratory Shared
Parallel programs exhibit a small number of distinct data-sharing patterns. A common data-sharing pattern, migratory access, is characterized by exclusive read and write access by one processor at a time to a shared datum. We describe a family of adaptive cache coherency protocols that dynamically identify migratory shared data in order to reduce the cost of moving them. The protocols use a sta...
Meshes vs. Hypercubes: A case study for Distributed Shared-memory Multiprocessors
Distributed shared-memory multiprocessors (DSMs) are gaining acceptance because they are easier to program than multicomputers. Recently proposed DSMs use a direct interconnection network to access remote memory locations, making these architectures scalable. Most DSMs implement a cache coherence protocol in hardware. This protocol exchanges data and control messages through the interconnection n...
Automatic Generation of Verifiable Cache Coherence
Performance modelling and verification are vital steps in the development cycle of any cache coherency protocol. Two separate models are usually required to perform each analysis step and as protocols become increasingly complex each can become correspondingly unwieldy. We examine how stochastic process algebra can be used to describe cache coherency protocols in such a way as to allow both the ...
Reducing Controller Contention in Shared-Memory Multiprocessors Using Combining and Two-Phase Routing
In simple cache coherency protocols, serialisation can occur when many simultaneous accesses are made to data held in a single node, and when many accesses involve a common "home" node controller. This is ameliorated in various designs with a hierarchical or clustered structure. In this paper we investigate the idea of routing requests via an intermediate "proxy" node where combining is used to...